arXiv:1502.07199v1 [stat.AP] 25 Feb 2015


Mortality Models based on the Transform log(-log x)


Meitner Cadena* 


Abstract 

A new stochastic method for describing mortality is proposed and explored. It is based on
differences of observed time series of the transform log(-log x) of survival probabilities, which
seem to follow simple patterns over the years. These common structures are captured by a
representation based on age-specific constants and stochastic processes in time. Once the time
processes are projected, mortality forecasting is straightforward. The new model is compared
with the well-known Lee-Carter and Cairns-Blake-Dowd models using sex-based mortality
data of several countries. Some in-sample and out-of-sample goodness-of-fit
criteria show that in some situations the new model performs better than these two models.
Assessments of the performance of these models using rates of mortality improvement
are discussed.

Keywords: stochastic mortality model, survival function, survival function transform, Lee-

1 Introduction

Over the last decades, general populations have experienced sharp improvements in mortality.
These improvements, observed at all ages, have been largely attributed to technical developments
in public health and to improved socio-economic conditions (see e.g. @1, [HI and [26]). This
evolution has affected domains that are critical for the functioning of societies, for instance
the planning of social security and health care systems, the funding of retirement income
systems, and the organization of employment and education for increasingly older populations.
Hence adequate descriptions of future mortality are needed.

Growing efforts to provide appropriate mortality forecasts have followed, mainly after the
seminal paper of Lee and Carter [17] in 1992. They introduced a stochastic approach to forecasting
the mortality of a population using age and calendar year. Their proposal is still attractive and
extensively used because it fits most Western countries reasonably well and is simple to use in
practice; for instance, the Lee-Carter methodology is used as a benchmark methodology
by the US Bureau of the Census. Moreover, the Lee-Carter model has motivated other
authors to develop extensions and variants, for instance Brouhns et al. [6], Li and Lee
El, Renshaw and Haberman [23], and Hyndman and Ullah [15]. Other models have also been
developed to deal with the challenges that future mortality presents. Surveys of techniques for
mortality projection are presented in e.g. [21], E3l, and [4].

The literature shows that some of those models are based on transforms of survival functions.
In models of this type, a transform is applied to survival functions so that the outcomes
have much simpler dynamics than the survival functions themselves. In this way, these
models may describe an important part of the variability of mortality data over fitting periods as
well as over forecasting periods. Well-known models of this type include the relational method
based on the logit transform introduced by Brass in 1974 [5] and, more recently, the model
proposed by de Jong and Marshall in 2007 0, inspired by the Wang transform, which describes shifts
of z-scores. In these two models the transforms of survival functions are represented principally
by time parameters.

Let us look more closely at these models based on transforms of survival functions, which are
related to our new model. Brass proposed the model (see [5] and e.g. [22])

$$\operatorname{logit}(S(x,t)) = \alpha_t + \beta_t \operatorname{logit}(S(x,\tau)), \quad t > \tau, \qquad (1)$$

where $S(x,t)$ is the survival function at age $x$ in year $t$, $\operatorname{logit}(x) = \log(x/(1-x))$,
$0 < x < 1$, is the logit transform of $x$, $\log(x)$ is the natural logarithm of $x$, $\alpha_t$ and
$\beta_t$ are constants, and $\tau$ is a given year. On the other hand, de Jong and Marshall proposed
a model based on z-scores given by (see Edl)

$$z_{x,t} = \Phi^{-1}(S(x,t)) = R_x'\beta + \alpha_t + \epsilon_{x,t}, \qquad d\alpha_t = \lambda\,dt + \sigma\,db_t,$$

where the first part involves the standard normal distribution function $\Phi$ and is a linear
regression model with a known regressor vector $R_x$, an unknown regression parameter vector
$\beta$, a value $\alpha_t$ to be defined in the second part, and zero-mean measurement errors
$\epsilon_{x,t}$, and the second part models $\alpha_t$ as a Wiener process with drift $\lambda$ and
variance $\sigma^2$ ($\sigma > 0$). The Wang transform $\Phi(\Phi^{-1}(x) + \alpha)$ is
identified when these authors rewrite the continuous model in its discrete version and introduce
their observation that the shifts of the z-scores (obtained by applying the inverse of the
standard normal distribution function to survival functions) between successive years seem to be
constant over time, which gives

$$S(i, n+k) = \Phi(z_{i,n} + \lambda k) = \Phi(\Phi^{-1}(S(i,n)) + \lambda k).$$

The following rewriting of the last expression is of interest to us:

$$\Phi^{-1}(S(i, n+k)) = \Phi^{-1}(S(i,n)) + \lambda k. \qquad (2)$$

Note that (1) and (2) are relationships among transforms of survival functions. Also, the dynamics
of these models over time reduce to the dynamics of the parameters $\alpha_t$ and $\beta_t$, and
$\lambda k$, respectively.

In this article we consider the application of the transform $L(x) = \log(-\log x)$, $0 < x < 1$, to
survival functions for representing and forecasting mortality. To this end, the differences between
the values of $L$ in any year and in a given reference year are modeled using time and age
parameters. The resulting relationship is like (1) or (2), and thus mortality forecasting
reduces to forecasting the time parameters, which makes it straightforward.
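The transform and its inverse can be sketched in a few lines. This is only an illustration: the function names `L` and `L_inv` and the sample probabilities are assumptions made for the example, not taken from the paper.

```python
import numpy as np

def L(s):
    """Transform L(s) = log(-log s), defined for 0 < s < 1."""
    s = np.asarray(s, dtype=float)
    return np.log(-np.log(s))

def L_inv(y):
    """Inverse transform: s = exp(-exp(y)), which always lies in (0, 1)."""
    return np.exp(-np.exp(np.asarray(y, dtype=float)))

# Hypothetical survival probabilities, from high to low.
s = np.array([0.99, 0.9, 0.5, 0.1])
y = L(s)  # increases as the survival probability decreases
```

Because the inverse maps any real number back into (0, 1), projections made on the transformed scale always correspond to valid survival probabilities.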

This paper is organized as follows. The application of $L$ to survival functions is examined in
Section 2. Section 3 presents the formulation of our new stochastic mortality model based on $L$.
Section 4 presents numerical illustrations of applications of the new model to sex-based mortality
data of some countries. Comparisons of the new model with the well-known Lee-Carter (LC) and
Cairns-Blake-Dowd (CBD) models using in-sample and out-of-sample goodness-of-fit criteria of
mortality projections are provided. Additionally, assessments of the performance of these
models using rates of mortality improvement are discussed. The last section presents conclusions
and next steps of future research with the proposed model.


2 Examination of the application of L to survival functions 

Survival functions are used intensively in this paper. We first present some ways to obtain these
functions, since they are not available in life tables. To this end, we use the probability that an
individual aged $x$ in year $t$ does not reach age $x+1$, $q_{x,t}$, and the central death rate,
$m_{x,t}$. For simplicity, we suppress the explicit dependence on $t$ in the notation when no
confusion is possible.

On the one hand, for a given age $x_0$, the survival function $S(x)$ associated with $x_0$ is based on
the probabilities that someone aged exactly $x_0$ will survive for $x - x_0$ more years and then die
within the following year. These probabilities are classically written as ${}_{x-x_0|}q_{x_0}$, and for
ease of notation we write them as $r_x$. In the actuarial literature, the curve of $r_x$,
$x = x_0, x_0+1, \ldots$, is called the curve of deaths.


$r_x$ and $q_x$ are related by

$$r_x = \begin{cases} q_x, & x = x_0, \\ q_x \prod_{i=x_0}^{x-1} (1 - q_i), & x > x_0, \end{cases}$$

and $S(x)$ corresponds to

$$S(x) = \sum_{k > x} r_k, \quad x \ge x_0.$$

It is not hard to see that $\sum_{k \ge x_0} r_k = 1$ and $S(x) = \prod_{i=x_0}^{x} (1 - q_i)$, $x \ge x_0$.
Conversely, from $r_x$ or $S(x)$ one can compute $q_x$ using

$$q_x = \begin{cases} r_x, & x = x_0, \\ r_x \Big/ \prod_{i=x_0}^{x-1} (1 - q_i), & x > x_0, \end{cases}$$

or $q_x = r_x / S(x-1) = 1 - S(x)/S(x-1)$, $x > x_0$, recalling that $r_{x_0} = q_{x_0}$.


On the other hand, throughout this paper we assume that the force of mortality, defined by
$\mu(x) = -S'(x)/S(x)$, is constant within bands of age and time, but allowed to vary from one band
to the next, i.e.

$$\mu_t(x) = \mu_{t+\varepsilon}(x+\delta) \quad \text{for all } 0 \le \varepsilon, \delta < 1. \qquad (3)$$

This implies that $\mu_t(x) = m_{x,t}$ and $e^{-\mu_t(x)} = 1 - q_{x,t}$. Combining these relationships
gives $e^{-m_{x,t}} = 1 - q_{x,t}$, which allows the computation of survival probabilities from
$m_{x,t}$ using the previous relationships between $r_x$, $q_x$, and $S(x)$.

Assumption (3) is often used in mortality data analyses, see e.g. [6] and [8].
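The chain from central death rates to survival probabilities can be sketched as follows. The function name `survival_from_m` and the numerical rates are hypothetical; the sketch assumes the first entry of the input corresponds to the starting age $x_0$ and uses the convention $S(x_0 - 1) = 1$.

```python
import numpy as np

def survival_from_m(m):
    """Compute q_x, r_x and S(x) for one year from central death rates.

    m: rates m_x for ages x0, x0+1, ... (the first entry is assumed to be age x0).
    """
    m = np.asarray(m, dtype=float)
    q = 1.0 - np.exp(-m)                      # from exp(-m_x) = 1 - q_x
    S = np.cumprod(1.0 - q)                   # S(x) = prod_{i=x0}^{x} (1 - q_i)
    S_prev = np.concatenate(([1.0], S[:-1]))  # S(x-1), with S(x0 - 1) taken as 1
    r = q * S_prev                            # curve of deaths r_x
    return q, r, S

m = np.array([0.010, 0.012, 0.015, 0.020])    # hypothetical central death rates
q, r, S = survival_from_m(m)
```

On a truncated age range the values of `r` sum to one minus the last survival probability rather than to one, since deaths beyond the last age are not included.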


Let us illustrate the computation of survival probabilities from observed central death rates.
Estimates of $r_{x,t}$ and $S(x,t)$, taking $x_0 = 55$, for French females for some ages and years,
calculated from historical data of $m_{x,t}$ obtained from the Human Mortality Database [13], are
presented in Figure 1. In both panels of this figure there are sharp shifts of the curves towards
the right. This is evidence of mortality improvement over the years. However, these movements do
not always happen in the same way. The sequence of curves of deaths reveals that these curves
become increasingly narrow and thus their peaks tend to be higher. Moreover, the life span of
French females shows no sign of approaching a fixed limit, since the last survival probability of
the curve of deaths presented in Figure 1 tends to increase over time. All of this shows that the
dynamics of $r_{x,t}$ and $S(x,t)$ are complex, and examining them requires strategies that
appropriately dissect their behavior.


In this paper we use the transform $L$ to represent mortality dynamics. To this end, let us first
see how this transform works when it is applied to survival functions, using the example
introduced above. Figure 2 shows the results of applying $L$ to the survival functions exhibited
in Figure 1. The resulting curves are quite smooth at most ages compared with the survival
functions shown in Figure 1, except for ages near $x_0$. Although they initially show a concave
shape, the transformed curves seem to become nearly linear over the ages 70 - 94 as time
increases. Moreover, they present downward shifts between the years considered, but these shifts
are not constant and the curves do not always keep the same shape. Furthermore, when age
approaches the lowest or the highest ages studied, the curves are increasingly close to each other.

Figure 1: Survival curves for French females, $x_0 = 55$. (a) Curve of deaths ($r_{x,t}$). (b) Survival function ($S_t(x)$).

Figure 2: Application of $L$ to survival functions for French females, $x_0 = 55$.


Next, we formulate relationships like (1) or (2) in terms of $L$. We consider the difference of the
transformed survival functions computed for years $t$ and $t_0$, where $t_0$ is a given year, i.e.
$L(S_t(x)) - L(S_{t_0}(x))$. Using the data introduced above, Figure 3 shows the outputs of these
computations when the reference year is 1950. We now have only three curves, since the curve
corresponding to 1950 coincides with the horizontal axis and is therefore not useful in our
analysis. Compared with those exhibited in Figure 2, the curves are clearly distinguished and well
separated from one another, and they show increasingly pronounced concave shapes and downward
shifts as time increases. These curves do not behave unstably at high ages, i.e. near 94 years old.
This is an advantage when modeling those ends of the curves, because the structure to describe is
simple. Furthermore, the transformation $L(S_t(x)) - L(S_{t_0}(x))$ seems a promising tool to study
very high ages, e.g. 90 and over, since the trend of its outcomes seems to be linear.

Figure 3: Differences of $L(S(x))$ with respect to 1950 for French females, $x_0 = 55$ (horizontal axis: age).
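The differences plotted in this kind of figure can be computed directly from a matrix of survival probabilities. The sketch below uses synthetic Gompertz-type death rates with a constant yearly improvement; the rates, the function name `L_differences`, and the array layout (ages in rows, years in columns) are assumptions for illustration, not the French data used in the paper.

```python
import numpy as np

def L_differences(S, ref_col=0):
    """Return L(S_t(x)) - L(S_t0(x)) for every year, with t0 given by ref_col.

    S: array of shape (ages, years) holding survival probabilities in (0, 1).
    """
    LS = np.log(-np.log(S))
    return LS - LS[:, [ref_col]]

ages = np.arange(55, 95)
years = np.arange(1950, 1954)
# Synthetic rates: exponential in age, improving by 2% (log scale) per year.
m = 0.001 * np.exp(0.1 * (ages[:, None] - 55)) * np.exp(-0.02 * (years[None, :] - 1950))
S = np.exp(-np.cumsum(m, axis=0))   # survival function under assumption (3)
Y = L_differences(S, ref_col=0)     # column 0 (the reference year) is identically zero
```

For this synthetic example the differences are the same at every age (a pure downward shift of 0.02 per year). Real data show age-dependent shapes, which is what the age constants of the model introduced below are meant to capture.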


3 New stochastic mortality models based on L 

3.1 New model 

We aim to model curves like those presented in Figure 3. For this purpose, given the non-linear
nature of the differences of $L(S(x))$, we borrow representations of mortality like those of James
and Segal in El or Cairns et al. in [7] and use them to describe such differences. The structures of
the methods of these authors are easy to implement and to interpret. In this paper we adopt the
representation given by Cairns et al. and allow greater flexibility in the age parameters. Our
model for representing and forecasting mortality, which we call the SL model, is, given $x_0$,
$x_{\min} = x_0$, $x_{\max}$, $t_0$, and $t_{\min} = t_0 + 1$,

$$L(S_t(x)) - L(S_{t_0}(x)) = \alpha_{1,t} + \alpha_{2,t}\,\kappa_x, \quad x_{\min} \le x \le x_{\max}, \quad t \ge t_{\min}, \qquad (4)$$

where $\kappa_x$ are age-specific constants and $\alpha_{1,t}$ and $\alpha_{2,t}$ are stochastic processes
that are assumed to be measurable at time $t$. Note that $\kappa_x$ varies linearly in [7], but in (4)
it may vary non-linearly.

To project $L(S_t(x)) - L(S_{t_0}(x))$, we adopt for the two-dimensional time series
$\alpha_t = (\alpha_{1,t}, \alpha_{2,t})'$ the dynamics given for the two-dimensional stochastic parameter
vector of the CBD model [7], i.e. $\alpha_t$ is modeled by the following two-dimensional random walk
with drift:

$$\alpha_{t+1} = \alpha_t + a + A Z_{t+1}, \qquad (5)$$

where $a$ is a constant $2 \times 1$ vector, $A$ is a deterministic $2 \times 2$ upper triangular matrix,
and $Z_t$ is a two-dimensional standard normal random variable.
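A minimal sketch of how a random walk with drift of this form might be estimated and projected is given below. All function names and the simulated history are hypothetical. One deliberate deviation is labeled in the code: NumPy's Cholesky routine returns a lower triangular factor, whereas the text specifies an upper triangular $A$; any factor with the same product $AA'$ yields the same distribution for the increments.

```python
import numpy as np

def fit_random_walk(alpha):
    """Estimate drift a and a factor A (with A A' = covariance of increments).

    alpha: array of shape (T, 2) with the fitted time series (alpha_1t, alpha_2t).
    """
    d = np.diff(alpha, axis=0)        # yearly increments alpha_{t+1} - alpha_t
    a = d.mean(axis=0)                # drift estimate
    V = np.cov(d, rowvar=False)       # 2 x 2 covariance of the increments
    A = np.linalg.cholesky(V)         # lower triangular (assumption, see lead-in)
    return a, A

def simulate(alpha_last, a, A, horizon, rng):
    """Simulate one future path of length `horizon` starting after alpha_last."""
    Z = rng.standard_normal((horizon, 2))
    return alpha_last + np.cumsum(a + Z @ A.T, axis=0)

def central_forecast(alpha_last, a, horizon):
    """Mean path: alpha_last + h * a for h = 1, ..., horizon."""
    return alpha_last + a * np.arange(1, horizon + 1)[:, None]

rng = np.random.default_rng(0)
true_a = np.array([-0.02, 0.001])     # hypothetical drift used to build a history
alpha_hist = np.cumsum(true_a + 0.01 * rng.standard_normal((30, 2)), axis=0)
a_hat, A_hat = fit_random_walk(alpha_hist)
paths = simulate(alpha_hist[-1], a_hat, A_hat, horizon=20, rng=rng)
center = central_forecast(alpha_hist[-1], a_hat, horizon=20)
```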

Note that from projections of $L(S_t(x)) - L(S_{t_0}(x))$ one can easily compute any mortality variable
using the relationships given in Section 2, for instance survival probabilities, $q_{x,t}$, and $m_{x,t}$.
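The back-transformation from a projected difference to mortality quantities can be sketched as follows. The function name and the inputs are hypothetical, and the sketch assumes that the first entry of each array corresponds to age $x_0$, so that $S(x_0 - 1)$ is taken as 1.

```python
import numpy as np

def mortality_from_L_difference(Y, S_ref):
    """Recover S_t(x), q_{x,t} and m_{x,t} from Y = L(S_t(x)) - L(S_t0(x)).

    Y, S_ref: arrays over ages x0, x0+1, ...; S_ref holds S_t0(x).
    """
    L_ref = np.log(-np.log(S_ref))
    S_t = np.exp(-np.exp(Y + L_ref))              # invert L
    S_prev = np.concatenate(([1.0], S_t[:-1]))    # S_t(x-1), with S_t(x0 - 1) = 1
    q_t = 1.0 - S_t / S_prev                      # q_x = 1 - S(x)/S(x-1)
    m_t = -np.log(1.0 - q_t)                      # from exp(-m_x) = 1 - q_x
    return S_t, q_t, m_t

m_ref = np.array([0.010, 0.012, 0.015, 0.020])    # hypothetical reference-year rates
S_ref = np.exp(-np.cumsum(m_ref))
Y = np.full(4, np.log(0.9))                       # hypothetical constant downward shift
S_t, q_t, m_t = mortality_from_L_difference(Y, S_ref)
```

In this toy case a constant shift of $\log 0.9$ on the transformed scale multiplies every central death rate by 0.9, which gives a simple interpretation of a pure level shift.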

Next, by varying $t_0$ we obtain extensions of (4), i.e. $t_0$ can be considered as one more parameter
in (4), and so can $x_0$. Note that, for fixed $t_{\min}$, $t_0$ can be chosen smaller than
$t_{\min} - 1$. Some of these features will be analyzed and exploited in the applications presented later.

Let us look at numerical illustrations of $L(S_t(x)) - L(S_{t_0}(x))$ when $t_0$ varies. Figure 4 shows
curves like those exhibited in Figure 3 for French females, but now for $t_0$ = 1951, 1952, 1953, and
1954. Note that if $t_0$ reaches 1955, then the curve corresponding to $t = 1955$ is of no interest in
the analysis. These new curves differ slightly from one another, and it seems that all of them move
upward when $t_0$ increases.

Figure 4: Differences of $L(S(x))$ with respect to several years for French females, $x_0 = 55$. Panels: (a) 1952, (b) 1953, (c) 1954, (d) 1955.

The curves corresponding to Figure 4 for French males are presented in Figure 5. They are somewhat
different from the curves for French females. The lower curves still have concave shapes, but the
upper curves are not always concave and contain convex pieces.



Figure 5: Differences of $L(S(x))$ with respect to several years for French males, $x_0 = 55$ (horizontal axis: age). Panels: (a) 1952, (b) 1953, (c) 1954, (d) 1955.


3.2 OLS estimation 

The model (4), given $t_0$ and $x_0$, is fitted to computed values of $L(S_t(x)) - L(S_{t_0}(x))$ using
ordinary least squares (OLS). This consists in minimizing

$$\sum_{x,t} \left[ L(S_t(x)) - L(S_{t_0}(x)) - \alpha_{1,t} - \alpha_{2,t}\,\kappa_x \right]^2. \qquad (6)$$

As noticed by Brouhns et al. [6], models like (4) are not simple regression models, since there are
no observed covariates on the right-hand side. Hence alternative ways of finding values of
$\alpha_{1,t}$, $\alpha_{2,t}$, and $\kappa_x$ that minimize (6) are required. We note that the
uni-dimensional or elementary Newton method proposed by Goodman [11] can be used in this case. This
method works like the iterative Newton method but updates one parameter at a time within each
iteration. That is, in iteration $k$ each parameter is updated before the next parameter is treated,
and the iteration moves from $k$ to $k+1$ once all the parameters have been updated. This process
continues until the sequence of parameter estimates converges or another stopping condition is met.

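Fitting (4) to data showed that some parameters may require slow convergence. This was
implemented by reducing the variations between iterations, for instance by multiplying such
variations by a factor $\gamma < 1$. This is taken into account in the following iterative algorithm
designed to minimize (6):

1. Initializations:

(a) Initialization of $\alpha_{1,t}$, $\alpha_{2,t}$, and $\kappa_x$: $\hat\alpha_{1,t}^{(0)}$, $\hat\alpha_{2,t}^{(0)}$, and $\hat\kappa_x^{(0)}$.
A strategy to obtain these initial values is to fix $\kappa_x$ using a known function of $x$. Then
one can find estimates $\hat\alpha_{1,t}$ and $\hat\alpha_{2,t}$ by applying linear regressions of
$L(S_t(x)) - L(S_{t_0}(x))$ on $\kappa_x$ for each $t$.

(b) Set the maximal number of iterations $k_{\max}$.

(c) Set $k = 1$.

2. While not convergent and $k \le k_{\max}$ do:

(a) For $k$ do:

$$\hat\alpha_{1,t}^{(k)} = \hat\alpha_{1,t}^{(k-1)} + \gamma\, \frac{\sum_x \left( L(S_t(x)) - L(S_{t_0}(x)) - \hat\alpha_{1,t}^{(k-1)} - \hat\alpha_{2,t}^{(k-1)} \hat\kappa_x^{(k-1)} \right)}{\sum_x 1},$$

$$\hat\alpha_{2,t}^{(k)} = \hat\alpha_{2,t}^{(k-1)} + \gamma\, \frac{\sum_x \hat\kappa_x^{(k-1)} \left( L(S_t(x)) - L(S_{t_0}(x)) - \hat\alpha_{1,t}^{(k)} - \hat\alpha_{2,t}^{(k-1)} \hat\kappa_x^{(k-1)} \right)}{\sum_x \left( \hat\kappa_x^{(k-1)} \right)^2},$$

$$\hat\kappa_x^{(k)} = \hat\kappa_x^{(k-1)} + \gamma\, \frac{\sum_t \hat\alpha_{2,t}^{(k)} \left( L(S_t(x)) - L(S_{t_0}(x)) - \hat\alpha_{1,t}^{(k)} - \hat\alpha_{2,t}^{(k)} \hat\kappa_x^{(k-1)} \right)}{\sum_t \left( \hat\alpha_{2,t}^{(k)} \right)^2};$$

and,

(b) $k = k + 1$.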


The criterion of convergence is to reach $|v^{(k)} - v^{(k-1)}| < \epsilon$, where $v$ is any of
$\alpha_{1,t}$, $\alpha_{2,t}$, or $\kappa_x$, for a reasonably small value of $\epsilon$. When this
stopping criterion is satisfied, the OLS estimates of $\alpha_{1,t}$, $\alpha_{2,t}$, and $\kappa_x$
obtained by Newton's method, $\hat\alpha_{1,t}$, $\hat\alpha_{2,t}$, and $\hat\kappa_x$, are the final
iterative estimates $\hat\alpha_{1,t}^{(k)}$, $\hat\alpha_{2,t}^{(k)}$, and $\hat\kappa_x^{(k)}$. In the
applications presented later, $\epsilon = 10^{-8}$ and the maximal number of iterations $k_{\max}$ is
fixed at 5,000.
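The iterative scheme above can be sketched as follows. This is an illustrative reading of the algorithm, not the author's code: the function names, the damping value `gamma=0.5`, and the synthetic data are assumptions, and the update denominators are those that follow from applying a one-parameter Newton step to the least squares objective (6).

```python
import numpy as np

def sse(Y, a1, a2, kappa):
    """Objective (6): sum of squared residuals."""
    return float(np.sum((Y - a1[None, :] - np.outer(kappa, a2)) ** 2))

def fit_sl(Y, kappa0, gamma=0.5, eps=1e-8, k_max=5000):
    """Damped elementary-Newton fit of Y[x, t] ~ a1[t] + a2[t] * kappa[x].

    Y: array (ages, years) of L(S_t(x)) - L(S_t0(x)); kappa0: initial kappa_x.
    """
    kappa = np.asarray(kappa0, dtype=float).copy()
    # Initialization: regress each year's column of Y on the fixed kappa.
    X = np.column_stack([np.ones_like(kappa), kappa])
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    a1, a2 = coef[0].copy(), coef[1].copy()
    k = 0
    for k in range(1, k_max + 1):
        a1_old, a2_old, kappa_old = a1.copy(), a2.copy(), kappa.copy()
        resid = Y - a1[None, :] - np.outer(kappa, a2)
        a1 = a1 + gamma * resid.mean(axis=0)                  # update alpha_1t
        resid = Y - a1[None, :] - np.outer(kappa, a2)
        a2 = a2 + gamma * (kappa @ resid) / (kappa @ kappa)   # update alpha_2t
        resid = Y - a1[None, :] - np.outer(kappa, a2)
        kappa = kappa + gamma * (resid @ a2) / (a2 @ a2)      # update kappa_x
        change = max(np.abs(a1 - a1_old).max(),
                     np.abs(a2 - a2_old).max(),
                     np.abs(kappa - kappa_old).max())
        if change < eps:                                      # stopping criterion
            break
    return a1, a2, kappa, k

# Synthetic example (hypothetical values, not from the paper's data).
ages = np.arange(60, 90)
z = (ages - ages.mean()) / 10.0
kappa_true = z + 0.02 * z ** 2
tt = np.arange(1, 31)
a1_true, a2_true = -0.02 * tt, -0.005 * tt
rng = np.random.default_rng(1)
Y = a1_true[None, :] + np.outer(kappa_true, a2_true) + 0.001 * rng.standard_normal((30, 30))
a1_hat, a2_hat, kappa_hat, n_iter = fit_sl(Y, kappa0=z)
```

Because each update is a damped one-dimensional Newton step on a quadratic, the objective cannot increase from one update to the next. Note that the parameters are only identified up to an affine change of $\kappa_x$, so fitted values should be compared through the fitted surface rather than parameter by parameter.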


4 Numerical illustrations 

We now apply the SL model to mortality data of seven industrialized countries and compare
its mortality projections with those of the LC and CBD models. We examine $q_{x,t}$ at higher
ages (60 - 94) using in-sample and out-of-sample goodness-of-fit criteria and considering several
age ranges.

To this aim, we start with the description of data to be considered and the presentation of the 
information to be modeled. Next, the LC and CBD models are briefly described. In the last part 
the results are shown and discussed. 


4.1 Data 

In this paper we use sex-based mortality data from the populations of Belgium (BE), France (FR),
Italy (IT), Japan (JP), Sweden (SW), the United Kingdom (UK), and the United States (US). The data,
obtained from the Human Mortality Database (HMD) [13], correspond to ages from 60 to 94 and to
years from 1960 to 2009. No adjustment is made to the data.

A typical dataset consists of the observed variable $D_{x,t}$ and the computed variable $E_{x,t}$ over
the above-mentioned ranges of ages and years. $D_{x,t}$ is the number of deaths during calendar year
$t$ of people aged $x$ last birthday, and $E_{x,t}$ represents an estimated average, during calendar
year $t$, of the number of people alive who were aged $x$ last birthday.

With these variables $m_{x,t}$ is estimated by (see e.g. [27])

$$m_{x,t} = \frac{D_{x,t}}{E_{x,t}},$$

and $q_{x,t}$ and $S_t(x)$ can then be computed using the relationships presented in Section 2.
Recall that $S_t(x)$ is involved in the SL model, and we will see that $m_{x,t}$ and $q_{x,t}$ are
used in the LC and CBD models.

4.2 Benchmark models 

We compare mortality projections of the SL model with those of the LC and CBD models. These
last two models are taken as benchmarks since they are well known and extensively used.
Brief descriptions of these models follow.


4.2.1 Lee-Carter model 

This model, proposed by Lee and Carter in [17], is given by

$$\log(m_{x,t}) = \alpha_x + \beta_x \kappa_t, \qquad (7)$$

where $\alpha_x$ and $\beta_x$ are age-specific constants at age $x$, and $\kappa_t$ is a time-varying index.

Since the parameter estimation for the LC model is not unique, the following two constraints are
adopted:

$$\sum_t \kappa_t = 0 \quad \text{and} \quad \sum_x \beta_x = 1.$$

To project mortality, the resulting estimate of the parameter $\kappa_t$ is projected as a stochastic
time series using standard Box-Jenkins methods.

To apply this model in our research we use the function lca provided by the demography package
for the R language (see [14]). This function estimates the parameters $\alpha_x$, $\beta_x$, and
$\kappa_t$ using singular value decomposition, following the methodology proposed by Lee and Carter.
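For readers who want to see the singular value decomposition step in isolation, here is a minimal NumPy sketch. It is not the R routine used in the paper: the function name `fit_lee_carter` and the toy inputs are assumptions, and the sketch omits the second-stage adjustment of $\kappa_t$ that is part of Lee and Carter's original procedure.

```python
import numpy as np

def fit_lee_carter(log_m):
    """Rank-one SVD fit of log m[x, t] = alpha[x] + beta[x] * kappa[t].

    log_m: array (ages, years). Imposes sum(beta) = 1 and sum(kappa) = 0.
    """
    alpha = log_m.mean(axis=1)                 # row means, so kappa sums to zero
    Z = log_m - alpha[:, None]                 # centered log rates
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    beta = U[:, 0]
    kappa = s[0] * Vt[0]
    scale = beta.sum()                         # rescale so that sum(beta) = 1
    return alpha, beta / scale, kappa * scale

# Toy inputs (hypothetical values) that satisfy the two constraints exactly.
alpha_true = np.array([-4.5, -4.2, -3.9, -3.6, -3.3])
beta_true = np.array([0.10, 0.15, 0.20, 0.25, 0.30])
kappa_true = np.array([2.0, 1.0, 0.0, -1.0, -2.0])
log_m = alpha_true[:, None] + np.outer(beta_true, kappa_true)
alpha_hat, beta_hat, kappa_hat = fit_lee_carter(log_m)
```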


4.2.2 Cairns-Blake-Dowd model 

This two-factor stochastic mortality model was introduced by [7]. It is given at age $x$,
$x \in [x_1, x_2]$, and time $t$ by

$$\operatorname{logit}(q_{x,t}) = \kappa_{1,t} + \kappa_{2,t}\,(x - \bar{x}), \qquad (8)$$

where $\kappa_{1,t}$ and $\kappa_{2,t}$ are stochastic processes that are assumed to be measurable at
time $t$, and $\bar{x}$ is the mean age of the range of ages analyzed. The two-dimensional time series
$\kappa_t = (\kappa_{1,t}, \kappa_{2,t})'$ is modeled as in (5).

Following [7], for each $t$, $\kappa_{1,t}$ and $\kappa_{2,t}$ are estimated using least squares by
transforming $q_{x,t}$ to $\operatorname{logit}(q_{x,t}) = \kappa_{1,t} + \kappa_{2,t}\,(x - \bar{x}) + \text{error}$.
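The year-by-year least squares step can be sketched as below. The function name `fit_cbd` and the toy parameter values are assumptions made for illustration.

```python
import numpy as np

def fit_cbd(q, ages):
    """Per-year OLS fit of logit(q[x, t]) = kappa1[t] + kappa2[t] * (x - mean age).

    q: array (ages, years) of death probabilities in (0, 1); ages: array of ages.
    """
    y = np.log(q / (1.0 - q))                        # logit transform
    xc = np.asarray(ages, dtype=float) - np.mean(ages)
    X = np.column_stack([np.ones_like(xc), xc])      # intercept and centered age
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)     # one regression per column (year)
    return coef[0], coef[1]

ages = np.arange(60, 90)
k1_true = np.array([-3.0, -3.1])                     # hypothetical values for two years
k2_true = np.array([0.10, 0.11])
logit_q = k1_true[None, :] + np.outer(ages - ages.mean(), k2_true)
q = 1.0 / (1.0 + np.exp(-logit_q))
k1_hat, k2_hat = fit_cbd(q, ages)
```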


4.3 Analysis of results 


In this subsection we compare the in-sample and out-of-sample predictive abilities of the SL, LC,
and CBD models with respect to sex-based $q_{x,t}$ data of seven countries. We examine this
mortality variable since it is commonly used in practice, for instance to build mortality tables and
to compute annuities.

We focus on mortality at higher ages because it is related to population ageing. Age ranges from
$x_{\min}$ to $x_{\max}$ are analyzed, taking $x_{\min}$ = 60, 65 and $x_{\max}$ = 89, 94.

Observed mortality from 1960 to 2009 is used to evaluate in-sample and out-of-sample mortality
projection accuracy. This period is divided in two: the first part, from $t_{\min} = 1960$ to
$t_{\max} = 1989$, is used to fit the models to data, and the second, from 1990 to 2009, to forecast
mortality. For the SL model we fix $t_0 = t_{\min} - 1$ in (4).

We examine both in-sample and out-of-sample projection accuracy of $q_{x,t}$ employing two
goodness-of-fit criteria often used to evaluate projection performance (see e.g. [18] and [9]).
The first criterion is the mean squared error (MSE), defined as

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( X_i - \hat{X}_i \right)^2,$$

where $X_i$ and $\hat{X}_i$ are the observed and estimated values of mortality in the fitting or
forecasting samples, and the second criterion is the mean absolute percentage error (MAPE), defined as

$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \frac{\left| X_i - \hat{X}_i \right|}{X_i} \times 100\,\%.$$

A smaller MSE or MAPE value indicates a better fit to the data over a given period. However, the
lowest values of these indexes are not necessarily associated with the same model. We will come
back to this issue later.
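The two criteria are simple to compute; the sketch below uses made-up observed and estimated values and hypothetical function names.

```python
import numpy as np

def mse(obs, est):
    """Mean squared error between observed and estimated values."""
    obs, est = np.asarray(obs, dtype=float), np.asarray(est, dtype=float)
    return float(np.mean((obs - est) ** 2))

def mape(obs, est):
    """Mean absolute percentage error, in percent."""
    obs, est = np.asarray(obs, dtype=float), np.asarray(est, dtype=float)
    return float(np.mean(np.abs(obs - est) / obs) * 100.0)

obs = np.array([0.010, 0.020])      # made-up observed q values
est = np.array([0.011, 0.018])      # made-up estimated q values
mse_star = 10_000 * mse(obs, est)   # scaled as in the tables (MSE* = 10,000 x MSE)
mape_pct = mape(obs, est)
```

MSE weights errors by their absolute size, so it is dominated by the highest ages where $q_{x,t}$ is largest, whereas MAPE weights errors relative to the observed level. This is one reason the two criteria can favor different models.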

For each combination of the values of $x_{\min}$ and $x_{\max}$ mentioned above, we present values
of MSE and MAPE in Tables 1, 2, 3, and 4. Each of these tables displays countries in rows and sex
categories in columns. For each country, the CBD, LC, and SL models are nested. Values of MSE and
MAPE are shown for the fitting and forecasting periods, nested within each sex category.
Additionally, for each combination of country, sex, and fitting or forecasting period, the lowest
values of MSE and MAPE among the models studied are highlighted, which allows the identification of
the models with the best fits. The values of MSE are displayed with two decimals and those of MAPE
with one decimal. This may cause some confusion if a highlighted value equals other displayed
values, but there is no conflict, since the selection was made using more decimals than those
displayed.

A general finding across all these tables is that, as expected, the values of MSE or MAPE over
forecasting periods are always much higher than their corresponding values over fitting periods.
Also, as indicated above, for fitting as well as for forecasting periods, the model with the
highest value of MSE is not always the model with the highest value of MAPE. Additionally, the
distribution of the highlighted models frequently differs from one table to another, but in a few
cases, for a pair of tables and a given country, one can find the same selection of models. Another
finding is that the values of MSE or MAPE, over fitting or forecasting periods, are in general
higher for males than for females, which suggests that modeling mortality is more difficult for
males than for females.

On the fitting period, for females, the highlighted values are concentrated mainly on the LC model
and in a few cases on the SL model, and none of these values is associated with the CBD model. An
interesting feature of the LC model is that it is highlighted almost everywhere when $x_{\max}$ is 94.

                          Female                           Male
                 Fitt. Period    Frcs. Period    Fitt. Period    Frcs. Period
Country  Model   MSE*    MAPE    MSE*    MAPE    MSE*    MAPE    MSE*    MAPE
BE       CBD     0.06    3.2     0.12    7.8     0.17    3.0     1.02    11.3
         LC      0.06    2.5     0.09    4.9     0.19    3.0     1.0     11.8
         SL      0.05    2.7     0.15    6.5     0.40    3.5     1.2     13.5
FR       CBD     0.06    4.2     0.14    9.9     0.10    3.1     0.35    8.7
         LC      0.02    1.9     0.11    7.8     0.09    2.2     0.4     7.2
         SL      0.02    1.8     0.09    5.8     0.13    2.5     0.4     8.4
IT       CBD     0.05    3.1     0.11    6.7     0.08    2.6     0.42    12.1
         LC      0.04    2.1     0.11    4.6     0.10    2.4     0.6     11.9
         SL      0.04    2.1     0.13    6.0     0.13    2.4     0.4     12.6
JP       CBD     0.06    3.1     0.07    11.3    0.09    2.1     0.13    9.7
         LC      0.04    1.9     0.28    13.9    0.06    1.8     0.1     11.3
         SL      0.05    1.9     0.07    9.0     0.09    2.0     0.6     10.1
SW       CBD     0.09    3.8     0.46    10.2    0.08    2.4     0.30    9.8
         LC      0.05    2.8     0.19    8.0     0.11    2.4     0.4     10.4
         SL      0.11    3.3     0.32    8.6     0.23    3.0     0.4     10.7
UK       CBD     0.04    2.5     0.16    10.4    0.07    2.1     1.74    15.3
         LC      0.02    1.5     0.11    9.7     0.08    1.9     1.4     15.2
         SL      0.02    1.8     0.08    8.5     0.12    2.1     0.9     13.3
US       CBD     0.07    4.2     0.59    6.8     0.05    2.1     0.28    7.3
         LC      0.01    1.6     0.31    6.6     0.04    1.7     0.3     7.5
         SL      0.02    2.0     0.24    5.9     0.06    2.0     0.4     7.9

MSE* = 10,000 x MSE.

Table 1: MSE and MAPE of $q_{x,t}$ for $x_{\min} = 60$ and $x_{\max} = 89$


Regarding males, the highlighted models are concentrated mainly on the CBD and LC models and
in a few cases on the SL model. Moreover, no model is predominant through all values of $x_{\min}$ and
$x_{\max}$.

On the forecasting period, a first observation is that the best model over the fitting period is
not always the best model over the forecasting period. This implies that the distribution of the
highlighted values varies between these two periods. For females, the SL model is strongly
present, except in the case $x_{\min} = 60$ and $x_{\max} = 94$. It performs better than the benchmark
models for US and UK, and gives the most accurate mortality projections for JP and SW when
$x_{\min} = 65$ and $x_{\max}$ is 89 or 94. Highlighted values associated with the LC and CBD models
appear only a few times. For males, there is a strong presence of the CBD model for an important
number of countries through all values of $x_{\min}$ and $x_{\max}$, mainly BE, NE, SW, and US. The LC
and SL models are found in a few cases. Although the SL model is rarely selected as one of the best
models for males, it gives the most accurate mortality projections for UK for any value of
$x_{\min}$ or $x_{\max}$.

The values of MSE and MAPE allow the identification of the best model among the LC, CBD, and
SL models, but give no assessment of how well these models fit, i.e. is any of these models
acceptable? To answer this question, we focus, for a given age $x$, on the rate of mortality
improvement (MI) over the forecasting period. This rate indicates how mortality rates change
with respect to the mortality rates of a particular group at a specific point in time.

Rates of MI are often applied to initial mortality levels to establish generational tables and thus
obtain mortality estimates at any future point in time (see e.g. [ici). These rates are also
analyzed as a function of economic and medical factors to explore their dynamics (see e.g. [25]).
In this paper rates of MI are used to assess the performance of the models.

                          Female                           Male
                 Fitt. Period    Frcs. Period    Fitt. Period    Frcs. Period
Country  Model   MSE*    MAPE    MSE*    MAPE    MSE*    MAPE    MSE*    MAPE
BE       CBD     007 2A 009 5 3 0J7 2?7 L17 12.1
         LC      0.07    2.3     0.10    4.5     0.23    2.8     1.2     12.6
         SL      0.07    2.5     0.16    6.1     0.59    3.8     1.7     14.2
FR       CBD     003 22 007 0 8 006 2d 030 T2~
         LC      0.03    1.7     0.13    7.2     0.10    2.0     0.4     6.8
         SL      0.03    1.6     0.09    5.3     0.09    1.9     0.4     7.9
IT       CBD     006 23 008 03 007 2d 046 10.3
         LC      0.04    2.0     0.13    4.7     0.12    2.3     0.6     10.3
         SL      0.05    2.0     0.17    6.9     0.07    2.1     0.6     11.1
JP       CBD     007 2 2 014 9d OdO L8 Odl 8/7~
         LC      0.04    1.8     0.32    11.3    0.07    1.5     0.1     9.3
         SL      0.04    1.6     0.10    7.9     0.07    1.5     0.2     8.7
SW       CBD     OdO 3d) 036 05 009 2 2 035 9T~
         LC      0.06    2.5     0.22    7.8     0.13    2.3     0.5     9.7
         SL      0.12    3.0     0.19    7.8     0.14    2.5     0.5     10.3
UK       CBD     003 L9 015 03 006 L6 L98 14.8
         LC      0.02    1.4     0.13    7.5     0.09    1.7     1.7     14.6
         SL      0.03    1.6     0.09    6.8     0.11    2.0     1.2     13.1
US       CBD     004 23 057 7.4 005 L9 031 7T”
         LC      0.01    1.5     0.36    7.1     0.05    1.7     0.4     7.6
         SL      0.03    1.9     0.30    6.5     0.08    2.0     0.5     8.3

MSE* = 10,000 x MSE.

Table 2: MSE and MAPE of $q_{x,t}$ for $x_{\min} = 65$ and $x_{\max} = 89$

The definition of the rate of MI is not unique (see e.g. (I] and [ 2 ^]). These rates may be computed
as arithmetic or logarithmic rates, they may be based on $q_{x,t}$ or on $m_{x,t}$, and they may be
calculated with respect to a given year or to the previous year. For $x = 65$, considering
$x_{\min} = 65$ and $x_{\max} = 94$, we compute annual logarithmic rates of MI with respect to 1989,
for $t > 1989$, by

$$\Delta_{65,t} = -\log\left( \frac{q_{65,t}}{q_{65,1989}} \right) \times 100,$$

where $q_{65,1989}$ is the value of $q_{65,t}$ observed in 1989. Note that this definition can be seen
as the cumulative rate of MI in year $t$ with respect to 1989. This rate is expected to increase over
the years, since $q_{x,t}$ has shown downward trends in recent decades.
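A small numerical sketch of this cumulative logarithmic rate follows. The function name `mi_rate` and the probabilities are invented for illustration only.

```python
import numpy as np

def mi_rate(q_t, q_ref):
    """Cumulative logarithmic rate of mortality improvement, in percent."""
    return -np.log(np.asarray(q_t, dtype=float) / q_ref) * 100.0

q_1989 = 0.020                                   # hypothetical value of q at age 65 in 1989
q_path = np.array([0.020, 0.019, 0.018, 0.010])  # hypothetical later values of q at age 65
delta = mi_rate(q_path, q_1989)                  # zero when q is unchanged, grows as q falls
```

With this definition, a halving of $q_{65,t}$ relative to 1989 corresponds to a rate of $100 \log 2 \approx 69.3$.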

Figure 6 shows the observed and projected rates of MI obtained for females for each country.
The corresponding rates for males are presented in Figure 7. In all these figures the curves
have upward trends with variations from one year to the next, which is empirical evidence of
the uncertainty of the rates involved. The dynamics of these curves are related to the main concern
about the models applied in this article: the more the curves of projected and observed rates are
separated, the higher the model risk may be. In the worst cases the distances between those
curves may be systematic and increasing.

For females (Figure 6), the observed rates for IT and US present little variation and are closely
matched by the predicted rates given by the CBD and LC models for IT and by the LC and SL models for
US. For BE, the forecasted rates of the CBD and SL models tend to overlap the observed ones, whereas
the forecasted rates of the LC model slightly overestimate the observed ones. For FR, JP, SW, and
UK, the observed rates are hard to represent by the benchmark and new models, since systematic
differences between observed and predicted rates are found.

                          Female                           Male
                 Fitt. Period    Frcs. Period    Fitt. Period    Frcs. Period
Country  Model   MSE*    MAPE    MSE*    MAPE    MSE*    MAPE    MSE*    MAPE
BE       CBD     034 06 020 73 048 33 L03 10.0
         LC      0.23    2.6     0.26    5.0     0.55    3.2     1.9     11.8
         SL      0.25    2.9     0.41    6.9     0.98    3.9     2.6     14.0
FR       CBD     015 4J 019 08 018 3d 069 8/7~
         LC      0.06    1.9     0.20    7.1     0.21    2.3     0.5     6.6
         SL      0.05    1.8     0.21    6.1     0.40    2.5     1.3     9.1
IT       CBD     026 05 019 05 017 06 061 11.1
         LC      0.11    2.1     0.15    4.3     0.25    2.5     0.8     11.2
         SL      0.09    2.1     0.39    6.0     0.21    2.4     0.8     11.7
JP       CBD     028 0 5 023 1L1 045 OO 023 fuT
         LC      0.11    2.0     0.68    13.4    0.20    1.9     0.2     10.2
         SL      0.17    2.0     0.23    9.3     0.20    2.0     1.1     9.1
SW       CBD     043 43 L20 104 039 2/7 061 <uT
         LC      0.20    2.9     0.71    7.9     0.50    2.8     1.6     10.8
         SL      0.29    3.3     1.15    9.0     0.72    3.2     2.0     11.2
UK       CBD     Oil 2/7 030 9/7 014 23 L84 13.8
         LC      0.05    1.6     0.16    8.8     0.16    2.0     1.3     13.6
         SL      0.09    1.9     0.15    7.8     0.19    2.1     1.1     12.5
US       CBD     Oil i! L73 8T 007 23 057 73~
         LC      0.02    1.6     0.93    7.4     0.08    1.7     0.8     7.8
         SL      0.05    2.0     0.95    6.9     0.09    2.0     0.8     8.0

MSE* = 10,000 x MSE.

Table 3: MSE and MAPE of $q_{x,t}$ for $x_{\min} = 60$ and $x_{\max} = 94$

The behaviors of the forecasted rates for males (Figure 7) are quite different from those for
females (Figure 6). For males, in all the countries studied in this paper (except JP), the
forecasted rates tend to underestimate the observed rates. The best forecasted rates are found
for FR and US, with the LC model for FR and the SL model for US. For the other countries the
forecasted rates are far from the observed rates, and these differences grow over time.

The projected and observed curves shown in Figures 6 and 7 are in general farthest apart in the
last years of the forecasting period. Let us examine the differences between these curves in the
last two years, 2008 and 2009, by computing the following MAPE-type index.


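$$
\mathrm{MAPE}_{a_{65},2008,2009} = \frac{1}{2} \left( \frac{\left| a_{65,2008} - \hat{a}_{65,2008} \right|}{a_{65,2008}} + \frac{\left| a_{65,2009} - \hat{a}_{65,2009} \right|}{a_{65,2009}} \right) \times 100\,\%.
$$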


Results of this index are shown in Table 5. They are organized by country, sex, $x_{\min}$, and
$x_{\max}$, so they include the cases considered in Figures 6 and 7. The lowest values of
$\mathrm{MAPE}_{a_{65},2008,2009}$ among the models studied are highlighted. Smaller highlighted
values indicate a better forecast of the data of 2008 and 2009, but only a few of them are found:
considering those near 10.0 % or less, we have IT and US females and FR and US males, which
together represent 21.4 % of the cases (12 of the 56 combinations of country, sex, and age range).
Besides, the SL model seems to be appropriate for mortality forecasting for US females aged 65 and
over and for US males aged 60 and over. For US females aged 60 and over the forecasts given by the
SL model also seem acceptable.
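
As an illustration, the index defined above can be computed as in the following minimal Python
sketch. The function name `mape_index`, the dictionary layout, and the numerical rates are
assumptions made only for this example; the rates are hypothetical and are not values from Table 5.

```python
def mape_index(observed, forecast):
    """Average absolute relative error, in percent, over the years in `observed`.

    `observed` and `forecast` map a year to a rate. Observed rates are assumed
    to be strictly positive, since each one is used as a denominator.
    """
    if set(observed) != set(forecast):
        raise ValueError("observed and forecast must cover the same years")
    relative_errors = [
        abs(observed[year] - forecast[year]) / observed[year] for year in observed
    ]
    return 100.0 * sum(relative_errors) / len(relative_errors)


# Hypothetical rates of MI at age 65 (illustrative only, not taken from the paper).
observed_rates = {2008: 0.40, 2009: 0.50}
forecast_rates = {2008: 0.36, 2009: 0.45}

index_value = mape_index(observed_rates, forecast_rates)
print(f"MAPE index for 2008 and 2009: {index_value:.1f} %")
```

In this hypothetical case each forecast lies 10 % below the observed rate, so the index equals
10 %, which is the order of magnitude taken as acceptable in the discussion above.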


                Female                                  Male
                Fitting period      Forecasting period  Fitting period      Forecasting period
Country  Model  MSE*      MAPE      MSE*      MAPE      MSE*      MAPE      MSE*      MAPE
BE       CBD    041       02        021       54)       052       34)       L17       10.5
         LC     027       2.5       031       4.8       0.64      3.0       2.2       12.3
         SL     0.30      2.7       0.45      6.6       1.03      4.0       3.2       13.4
FR       CBD    021       23        016       05        016       2d        045       OfT
         LC     0.07      1.7       0.22      6.5       0.24      2.1       0.5       6.2
         SL     0.06      1.6       0.22      5.9       0.46      2.3       1.7       8.9
IT       CBD    033       02        014       OO        018       23        060       03~”
         LC     0.12      2.0       0.17      4.3       0.29      2.4       1.0       9.8
         SL     0.12      2.0       0.54      6.9       0.21      2.2       0.8       10.0
JP       CBD    034       3d        051       06        046       23        022       fuT
         LC     0.12      1.9       0.75      11.0      0.22      1.7       0.2       8.3
         SL     0.16      1.7       0.27      8.1       0.20      1.7       0.6       7.9
SW       CBD    048       0 8       095       0 6       043       2 3       070       05~
         LC     0.23      2.7       0.80      7.8       0.58      2.7       2.0       10.3
         SL     0.27      3.0       0.63      7.6       0.81      3.1       1.9       10.5
UK       CBD    012       Z2        025       7?7       013       L8        003       13.2
         LC     0.06      1.5       0.18      6.8       0.18      1.8       1.6       12.8
         SL     0.13      1.8       0.15      6.1       0.19      2.0       1.3       11.8
US       CBD    007       08        L61       03        007       L8        059       OfT
         LC     0.02      1.4       1.07      7.9       0.09      1.7       1.0       8.0
         SL     0.09      2.0       0.98      7.5       0.12      2.0       1.1       8.5

MSE* = 10,000 × MSE.

Table 4: MSE and MAPE of $q_{x,t}$ for $x_{\min} = 65$ and $x_{\max} = 94$


5 Conclusion


A new model for stochastic mortality projection based on the application of the transform
log(- log x) to survival functions was proposed. This transformation was represented by
age-specific parameters and stochastic processes depending on time. This model has a structure
like those given in d] and QdJ. Mortality forecasting was obtained from the projection of the
time-processes.
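
For concreteness, the transform and its inverse can be sketched in a few lines of Python. This is
only an illustration of log(- log x) applied to survival probabilities, not the estimation
procedure of the model; the function names and the sample probabilities are assumptions made for
this example.

```python
import math


def loglog_transform(survival_probability):
    """Return log(-log(s)) for a survival probability s with 0 < s < 1."""
    if not 0.0 < survival_probability < 1.0:
        raise ValueError("survival probability must lie strictly between 0 and 1")
    return math.log(-math.log(survival_probability))


def inverse_loglog_transform(value):
    """Map a transformed value back to a survival probability, s = exp(-exp(value))."""
    return math.exp(-math.exp(value))


# Hypothetical survival probabilities at increasing ages (illustrative only).
survival = [0.95, 0.80, 0.50, 0.10]
transformed = [loglog_transform(s) for s in survival]
recovered = [inverse_loglog_transform(v) for v in transformed]

for s, v in zip(survival, transformed):
    print(f"s = {s:.2f}  ->  log(-log s) = {v:.4f}")
```

The transform is decreasing in the survival probability, so lower survival probabilities map to
larger transformed values, and the inverse returns any projected transformed value to the interval
(0, 1).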

According to some goodness-of-fit criteria, the application of this model to sex-based in-sample
and out-of-sample mortality data from seven countries showed that in some cases it outperforms
two well-known stochastic mortality models. These findings were clearest for females over the
forecasting period studied. These global assessments were complemented with year-by-year
evaluations of the rates of MI. These results gave appraisals of the quality of the mortality
forecasts, showing that in a few cases forecasts of the rates of MI may be acceptable. In many
cases, however, the mortality projections were far from the observed mortality. These findings
corroborated the challenge that future mortality presents and showed the “mortality gap” between
predicted and observed mortality that new models should mitigate. To this end, our new model
seems to capture fundamental and enduring features of mortality patterns that help to deal with
this mortality gap in some cases, these features being based on a reference year and on its
relationships with subsequent years. Hence, this new model seems a promising approach to obtaining
more accurate projections of mortality.

In many cases, the observed mortality gaps tend to grow over the years. These model failures
have been documented in the literature, and they may produce longevity risk that may be reflected
in financial losses (see e.g. [24]). Hence, the information provided by the new model may benefit
practitioners in their efforts to reduce longevity risk in the pricing and valuation of products
involving longevity. A wide survey on the impacts of longevity risk on pension funds and annuity
providers is found in jjdj.

                       Female                  Male
Country  x_min  x_max  CBD     LC      SL      CBD     LC      SL
BE       60     89     50.7    61.5    40.8    40.2    34.9    51.4
                94     54.6    57.6    40.6    42.5    36.8    52.9
         65     89     87.3    61.2    43.1    47.8    39.2    52.7
                94     89.0    57.0    42.8    51.7    41.1    63.7
FR       60     89     42.8    52.5    27.9    31.6    19.5    35.6
                94     49.1    51.2    28.2    28.2    19.4    33.2
         65     89     92.6    53.8    31.4    5.7     17.0    35.1
                94     97.2    52.2    30.5    3.1     17.0    33.8
IT       60     89     8.6     5.7     15.7    63.6    53.4    67.8
                94     6.3     5.4     17.3    64.1    53.4    70.5
         65     89     7.3     5.7     17.1    59.3    54.0    73.8
                94     9.5     5.5     19.2    61.1    53.7    74.2
JP       60     89     58.0    65.3    47.6    85.5    114.9   72.5
                94     59.8    66.4    48.6    84.9    115.6   78.3
         65     89     89.5    66.0    56.0    125.7   120.5   104.1
                94     87.5    66.9    57.0    116.9   120.8   107.1
SW       60     89     70.7    30.4    50.7    57.9    68.8    69.8
                94     76.9    28.8    44.0    57.7    71.0    71.5
         65     89     109.3   29.5    25.1    58.6    69.5    74.6
                94     115.9   27.9    29.1    58.2    72.1    74.1
UK       60     89     61.8    70.1    60.7    62.5    63.1    62.8
                94     57.3    70.6    60.9    63.7    63.8    62.5
         65     89     52.0    70.5    71.8    66.5    64.9    61.7
                94     44.0    71.0    71.8    68.4    65.8    61.9
US       60     89     9.6     9.4     9.7     33.5    22.7    10.6
                94     14.4    9.3     9.6     33.0    23.6    10.0
         65     89     28.6    9.3     8.7     28.7    21.8    16.1
                94     45.3    9.2     8.9     28.5    23.0    13.2

Table 5: $\mathrm{MAPE}_{a_{65},2008,2009}$ (%)


Figure 6: Rates of MI with respect to 1989, taking $x_{\min} = 65$ and $x_{\max} = 89$, females and
$x = 65$ (horizontal axis: Year, 1990 to 2010; series: Observed, CBD, LC, SL)


Another interesting result of our analysis is that the correspondence among the best models over
both fitting and forecasting periods, i.e. that the same model be the best one over the fitting
and forecasting periods, was not systematic. This finding indicates that the selection of models
cannot reliably be based on the analysis of in-sample errors, as claimed by authors such as Booth
and Tickle in jj], although evidently a model should provide a good fit to the historical data.
Further studies analyzing out-of-sample errors using historical data are required.
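
The following minimal Python sketch illustrates how the best model over the fitting period may
differ from the best one over the forecasting period. It assumes the usual definitions of MSE and
MAPE and uses the rescaling MSE* = 10,000 × MSE of Tables 3 and 4; the models "A" and "B" and all
numbers are hypothetical and serve only to show the comparison.

```python
def mse(observed, predicted):
    """Mean squared error between observed and predicted rates."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def mape(observed, predicted):
    """Mean absolute percentage error, in percent (observed rates must be nonzero)."""
    terms = [abs(o - p) / abs(o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(terms) / len(terms)


def best_model(errors_by_model):
    """Return the name of the model with the smallest error."""
    return min(errors_by_model, key=errors_by_model.get)


# Hypothetical observed rates over a fitting and a forecasting period.
observed_fit = [0.0200, 0.0190, 0.0180]
observed_forecast = [0.0170, 0.0160]

# Hypothetical fitted and forecasted rates for two hypothetical models.
fitted = {"A": [0.0201, 0.0189, 0.0181], "B": [0.0205, 0.0185, 0.0185]}
forecasted = {"A": [0.0180, 0.0175], "B": [0.0172, 0.0163]}

# MSE* = 10,000 x MSE, computed separately for each period.
in_sample = {name: 10_000 * mse(observed_fit, values) for name, values in fitted.items()}
out_of_sample = {
    name: 10_000 * mse(observed_forecast, values) for name, values in forecasted.items()
}
out_of_sample_mape = {
    name: mape(observed_forecast, values) for name, values in forecasted.items()
}

best_in_sample = best_model(in_sample)
best_out_of_sample = best_model(out_of_sample)
print("best over fitting period:", best_in_sample)
print("best over forecasting period:", best_out_of_sample)
```

In this hypothetical example the model with the smallest in-sample error is not the one with the
smallest out-of-sample error, which is the kind of disagreement described above.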

The results obtained in this paper depend on data features, namely the length of the fitting
period, $t_0$, $x_{\min}$, and $x_{\max}$. For instance, the literature shows that shorter fitting
periods tend to work better because they capture the most recent mortality trend (see e.g. [3]).
Besides, according to our results, the selection of models is affected by variations of
$x_{\min}$, $x_{\max}$, sex, and country. This means that all these variables should be considered
as parameters in mortality studies.

Figure 7: Rates of MI with respect to 1989, taking $x_{\min} = 65$ and $x_{\max} = 89$, males and
$x = 65$ (horizontal axis: Year, 1990 to 2010; series: Observed, CBD, LC, SL)